Duration Modeling For Turkish Text-to-Speech Synthesis System
نویسندگان
چکیده
Naturalness of synthetic speech depends on appropriate modeling of prosodic aspects. Mostly, three prosody components are modeled: segmental duration, pitch contour and intensity. In this study, we present our work on modeling segmental duration in Turkish by using machine-learning algorithms. The models predict phone durations based on attributes such as phone identity, neighboring phone identities, lexical stress, position of syllable in word, part-of-speech information, word length in number of syllables and position of word in utterance. Obtained models predict segment durations better than mean duration approximations.
منابع مشابه
A Rule Based Prosody Model for Turkish Text-to-speech Synthesis
Original scientific paper This paper presents our novel prosody model in a Turkish text-to-speech synthesis (TTS) system. After developing a TTS system driven by parametric features consisting of duration, pitch and energy modifications, we try to figure out some prosody rules in order to increase the naturalness of our synthesizer. Since the inflected verbs in Turkish can be stand-alone senten...
متن کاملA Corpus-Based Concatenative Speech Synthesis System for Turkish
Speech synthesis is the process of converting written text into machine-generated synthetic speech. Concatenative speech synthesis systems form utterances by concatenating pre-recorded speech units. Corpus-based methods use a large inventory to select the units to be concatenated. In this paper, we design and develop an intelligible and natural sounding corpus-based concatenative speech synthes...
متن کاملAn implementation and evaluation of two diphone-based synthesizers for Turkish
This paper presents two diphone based Turkish text-to-speech systems; the first system is realized inside the MBROLA project, a freely available multilingual speech synthesizer and the second system is based on shape invariant harmonic modeling. Both synthesizers use the same parametric representations of two diphone databases (male, female) obtained by processing speech data with a pitch async...
متن کاملImplementation and evaluation of a text-to-speech synthesis system for turkish
In this paper, a diphone based Text-to-Speech (TTS) system for the Turkish language is presented. Turkish is the official language of Turkey, where it is the native language of 70 million people and it is also widely spoken in Asia (Azerbaidjain, Uzbekhstan, Kazakhstan, Kirgizhstan and Iran), Cyprus and the Balkans. The research has been done through a visiting internship at CSLR (the Center fo...
متن کاملModeling of geminate duration in an amharic text-to-speech synthesis system
This paper presents analysis and modeling of geminate duration in Amharic Text-to-Speech (AmhTTS) synthesis system. AmhTTS is a parametric and rule-based system that employs a cepstral method. The system uses a source filter model for speech production and a Log Magnitude Approximation (LMA) filter as the vocal tract filter. Fundamental speech units of the system are syllables. Gemination in Am...
متن کامل